With student evaluations of instructor effectiveness
playing an increasingly important role in the determination of merit
pay, promotion, and tenure, there is a growing interest in what these
evaluations actually measure. Faculty members frequently voice doubts
about using student evaluations, because it is not clear to what
extent they measure the leniency of the instructors, the amount the
instructors taught the students, or the performing ability of the
instructors. Previous studies of the problem have not been
satisfactory. This paper presents a sequential, three-equation model
to determine the effects of learning and leniency on evaluations. The
variables include: previous knowledge of the concepts of the course;
amount of previous course study; amount of related course study;
previous academic average; academic year of the student; time the
class meets; size of the class; and sex of the student. The model was
applied to students in 14 sections of the microeconomics portion of
the "Principles of Economics" course at the University of Western
Ontario. The results indicate that evaluations do not depend on
leniency. (LBB)
Leniency, Learning, and Evaluations


1. Introduction

With student evaluations of instructor effectiveness playing an
increasingly important role in the determination of merit pay, promotion,
and tenure, there is a growing interest in what these evaluations
actually measure. Faculty members frequently voice doubts about using
student evaluations because it is not clear to what extent they measure
the leniency of the instructors, the amount the instructors taught the
students, or the performing ability of the instructors.

1.1 Evaluations and Leniency

Several recent studies have documented a positive relationship between
the grades economics students receive and the evaluations they give
their instructors (Kelley, 1972; Capozza, 1973). Similar results have
also been reported for other disciplines (Murray, 1972) and across various
disciplines (Nichols and Soper, 1972; Perry and Bauman, 1973; Reuber, 1974).
These results are consistent with the view that instructors "buy" high
evaluations (and, they hope, higher merit pay, promotion, and tenure) by
"giving" the students higher grades. This view suggests that there is at
least a tacit collusion between instructors and students to scratch each
other's backs. The results are also consistent, though, with several other
behavioural models. Students with higher grades may have given higher
evaluations to their instructors because the instructors in these samples
taught to the brighter students. Alternatively, it is possible that a
positive correlation between grades and evaluations could be observed if
the better instructors, who justifiably received higher evaluations,
taught their students more, so that their students justifiably earned
higher grades. Finally, the causation may be in the opposite direction
from that usually assumed, and "an instructor might grade a class harshly
or generously because of the ratings he receives (or anticipates)"
(Doyle, 1974).

Many other studies have found no relationship between grades and
evaluations. These studies are well summarized by Costin et al. (1971)
and Menges (1973). But as McKenzie and Tullock (1975) point out,
the lack of a correlation between grades and evaluations does not
necessarily lead to a rejection of the hypothesis that more lenient
instructors receive higher evaluations. If instructors become more
lenient in the hopes of receiving higher evaluations, the students may
respond simply by studying less, and learning less, yet receiving no
lower grades. This phenomenon is particularly likely if students value
additional leisure time highly and are satisficers with respect to
grades. As a result, the use of grades, uncorrected for the knowledge
obtained by the students, as a measure of instructor leniency may be
quite misleading.

1.2 Evaluations and Amount Learned 
Attempts to measure the relationship between learning and student
evaluations of instructor effectiveness have yielded mixed results.
Capozza (1973) reported a negative and significant relationship between
evaluations and the amount learned, but he has since then indicated to us
by correspondence that with a larger sample his results are no longer
statistically significant. Besides using grades as a measure of leniency,
which we have already suggested may be inappropriate, Capozza also failed
to include any variables in his model to explain why some students might
learn more than others. Rodin and Rodin (1972) also found a significantly
negative relationship between evaluations and the amount learned, but their
study has been found lacking in several respects (see Frey, 1973 and
Eble, 1974), including small sample size and omitted variables.

Crowley and Wilton (1974) found a positive but insignificant
relationship between some measures of evaluations and the amount students
learned in beginning economics. Significantly positive relationships
have been reported by Gessner (1973), Frey (1973), and Doyle and Whitely
(1974).

It appears from the studies which have previously been conducted
and from the criticisms leveled at them that the issues have been clouded
by rhetoric and by the complexity of the relationships. What is needed is
a model which measures, first, the impact of the instructor on the amount
his students learn, correcting for other possible influences on learning.
Second, the model must measure the leniency of the instructor, correcting
for other influences (including the amount learned) on students' grades.
And third, it must relate these measures to students' evaluations of
instructor effectiveness, correcting for other possible influences. What
is needed, then, is a sequential, three-equation model to determine the
effects of learning and leniency on evaluations. We turn now to our
development of such a model.


2. Specification

2.1 Important Variables

The knowledge of economic concepts (KNOW) gained by a student in the
microeconomic portion of a beginning economics course depends on many things,
most of which are quantifiable. A list of these factors includes:


(1) Previous knowledge of economics concepts (PRE). Students
    knowing more economics at the beginning of a course may
    well know more than others at the end of the course, though
    they may not learn as much new material during the course.

(2) Amount of previous economics (PE). If the student has had
    an economics course previous to this one (perhaps in high
    school or perhaps a university course which he or she failed),
    we would expect the student to know more at the conclusion of
    the course.

(3) Amount of calculus taken by the student (CALC). Because
    much of microeconomic theory explicitly or implicitly
    deals with differentiation and integration, students with
    a calculus background may learn the concepts more easily
    than students without a calculus background. The amount of
    calculus in a student's background may also be a proxy for
    analytical and mathematical aptitude. (The latter was
    found by Crowley and Wilton (1974) to have a significantly
    positive effect on the amount of economics learned by
    students in beginning courses.)

(4) Previous academic average (AA). Students who have done well
    in the past in terms of their grades tend to continue to do
    well, either because of high aptitude or because of high
    motivation. Ability to take tests is a skill in itself;
    high academic average is, in part, a reflection of this
    ability. Because academic averages in secondary schools are
    probably not commensurate with academic averages for
    upperclass students at university, we have split AA into two
    parts: AAF to represent the previous academic average of
    first-year students and AAU the previous academic average of
    upperclass students.

(5) Academic year of the student (Y). If upperclass students
    are more mature than first-year students they may learn more
    in a course. It is also possible, however, as suggested by
    Crowley and Wilton, that upperclass students view a beginning
    economics course as one which deserves less of their
    attention and effort, so that they learn less. Also, students who
    postpone taking their first economics course until their 2nd
    or 3rd year in university may have less aptitude for it than
    first-year students.

(6) Time the class meets (T). Students might learn more in
    classes meeting at certain times of day than they would
    from classes meeting at other times of the day.

(7) Size of the class (SZ). We include this variable to see if
    class size actually affects learning.

(8) Sex of the student (FEM). Crowley and Wilton found that
    female students learn significantly less in a beginning
    economics course than male students do. Their measure of
    amount learned, however, was biased against students
    beginning the course with less knowledge of economic
    concepts, so that if females began a course knowing less
    economics and improved their knowledge by the same absolute
    amount as males did, then the Crowley and Wilton measure
    of amount learned would yield their result spuriously.
    We are including a dummy variable for females to determine
    whether the knowledge a student has of economic concepts in
    terms of absolute raw scores varies with the sex of the
    student, ceteris paribus.


Students in fourteen sections of the microeconomics portion of the
Principles of Economics course at the University of Western Ontario were
given a 19-question multiple-choice examination at the beginning of their
first class in September, 1974.¹ This examination is the "pretest." The
examination was administered by persons not teaching the course, and
instructors of the course were not permitted to see the questions on the
examination. Examination questions were designed to test students' mastery
of economic concepts rather than of economic jargon. This pretest serves as
our measure of PRE, a student's previous knowledge of economics. The same
examination was given to these students under examination conditions at the
end of the term in December. This "post-test" is used as our measure of
KNOW, a student's current knowledge of economics.²


¹ Principles of Economics is taught at U.W.O. in many sections, with an
average enrolment of about 60 students per section.

The examination we used was a slightly modified version of the
microeconomics portion of a test, similar in nature to the American TUCE but
better suited for testing Canadian students, designed by Crowley and Wilton
(1974). We eliminated some questions found by Crowley and Wilton to be
fairly weak indicators of student knowledge and added a few questions to
cover omitted material we felt ought to be included. A copy of the exam is
available from the authors.

² Students who dropped the course were omitted from the sample, as were
those who changed sections. Our final sample included 617 students.
After the "post-test" was administered, we asked the instructors
to indicate the degree of correspondence between the material covered by
the "post-test" and material covered in class. This correspondence was
found to be uniformly high for all sections, so that we are fairly
confident that our test measures areas of knowledge covered in all sections
in the sample.³

Students in sections in which multiple-choice testing is used
regularly throughout the term may not know more economics than others in
their cohort but may simply have had better practice in writing economics
multiple-choice exams and may therefore do better on our tests. It seems
appropriate to control for this possibility by including an additional
variable, viz.:

(9) Previous experience with multiple-choice questions
    in economics (MULT).


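2.2 "Knowledge" equation

In order to gauge an instructor's contribution to student
knowledge of economics, we estimated the following equation:

Equation 1 (i = student, j = section)

$$
\begin{aligned}
\mathrm{KNOW}_{ij} = {} & \alpha_0 + \alpha_1\,\mathrm{PRE}_{ij} + \alpha_2\,\mathrm{PE}_{ij} + \alpha_3\,\mathrm{CALC}_{ij} + \alpha_4\,\mathrm{AAF}_{ij} + \alpha_5\,\mathrm{AAU}_{ij} \\
& + \alpha_6\,\mathrm{Y}_{ij} + \alpha_7\,\mathrm{T}_{j} + \alpha_8\,\mathrm{SZ}_{j} + \alpha_9\,\mathrm{FEM}_{ij} + \alpha_{10}\,\mathrm{MULT}_{j} \\
& + \alpha_{11}\,\mathrm{INST}_{j} + u_{\mathrm{KNOW}}
\end{aligned}
$$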


³ An instructor whose class material differed significantly from that
covered on the "post-test" may have taught his students as much economics as
did other instructors, but his students would not have done as well, ceteris
paribus, on the post-test. The uniformly high degree of correspondence
between post-test and material covered in class is therefore reassuring.
In large measure, this result is probably due to the use of a common text
and reading list in this course.


With the exception of INST, the variables in this equation have been
defined above. In the estimation, we have treated the variables as dummies;
the precise definition of these dummy variables is given in Section 3.

INST is a set of dummy variables, one each for all but one instructor,
who serves as a kind of "numéraire". The set of estimated coefficients,
$\alpha_{11}$, thus gives us an estimate of the contribution of each
instructor to students' knowledge, relative to the contribution of the
omitted teacher. A high value of $\alpha_{11}$ will be associated with an
instructor whose contribution to student knowledge is relatively great,
while an instructor with a relatively small contribution will have a low
$\alpha_{11}$.
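The dummy-variable treatment of INST can be illustrated with a small sketch. It assumes a stripped-down version of Equation 1 with only an intercept, PRE, and instructor dummies, and it uses made-up, noise-free data; the helper names (`solve`, `ols`, `design_row`) and all numbers are hypothetical and are not taken from the study.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[c] for r in rows) for c in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(xtx, xty)


def design_row(pre, instructor, n_instructors):
    """[intercept, PRE, dummies for instructors 2..n]; instructor 1 is omitted."""
    dummies = [1.0 if instructor == j else 0.0 for j in range(2, n_instructors + 1)]
    return [1.0, float(pre)] + dummies


# Hypothetical (pretest score, instructor id) pairs, not data from the paper.
students = [(3, 1), (8, 1), (5, 2), (10, 2), (4, 3), (9, 3), (6, 1), (7, 2), (2, 3)]
effect = {1: 0.0, 2: 1.5, 3: 3.0}  # made-up effects relative to instructor 1
know = [7.0 + 0.4 * pre + effect[inst] for pre, inst in students]

rows = [design_row(pre, inst, 3) for pre, inst in students]
coef = ols(rows, know)
intercept, pre_slope, inst_effects = coef[0], coef[1], coef[2:]
print(inst_effects)  # estimated effects of instructors 2 and 3 relative to instructor 1
```

Because the reference instructor has no dummy, each estimated dummy coefficient is read as a difference from that instructor, which is how the INST coefficients are interpreted in the text.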
2.3 "Leniency" equation 

In order to determine the extent of an instructor's leniency in
assigning grades to students, we must control for variables other than
leniency which may affect each student's grade. Aside from instructor
leniency, the grade a student receives⁴ (GRADE) will depend on the variables
(2) through (9), defined in Section 2.1, as well as on the amount that the
student knows, which we measure by KNOW. Consequently, we estimated the
following equation:

Equation 2

$$
\begin{aligned}
\mathrm{GRADE}_{ij} = {} & \beta_0 + \beta_1\,\mathrm{PE}_{ij} + \beta_2\,\mathrm{CALC}_{ij} + \beta_3\,\mathrm{AAF}_{ij} + \beta_4\,\mathrm{AAU}_{ij} + \beta_5\,\mathrm{Y}_{ij} + \beta_6\,\mathrm{T}_{j} \\
& + \beta_7\,\mathrm{SZ}_{j} + \beta_8\,\mathrm{FEM}_{ij} + \beta_9\,\mathrm{KNOW}_{ij} + \beta_{10}\,\mathrm{MULT}_{j} + \beta_{11}\,\mathrm{INST}_{j} + u_{\mathrm{GRADE}}
\end{aligned}
$$

In this equation, the set of estimated coefficients $\beta_{11}$ plays a
role analogous to that of $\alpha_{11}$ in Equation (1). Here the
coefficients of INST provide a measure of the relative leniency of each
instructor, net of the leniency of the numéraire teacher. High values of
$\beta_{11}$ will be associated with relatively more lenient instructors.⁵


⁴ Grades are assigned on a numerical scale with 100 as the maximum.
Because instructors and other variables are expected to have an
impact on students' knowledge, including these same variables along with KNOW
(measured by the post-test scores) in the regressions may create problems
of multicollinearity and bias the estimated coefficients. An alternative
specification of Equation 2 is to substitute for KNOW from Equation 1:


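Equation 2A

$$
\begin{aligned}
\mathrm{GRADE}_{ij} = {} & (\beta_0 + \beta_9\alpha_0) + (\beta_1 + \beta_9\alpha_2)\,\mathrm{PE}_{ij} + (\beta_2 + \beta_9\alpha_3)\,\mathrm{CALC}_{ij} \\
& + (\beta_3 + \beta_9\alpha_4)\,\mathrm{AAF}_{ij} + (\beta_4 + \beta_9\alpha_5)\,\mathrm{AAU}_{ij} + (\beta_5 + \beta_9\alpha_6)\,\mathrm{Y}_{ij} \\
& + (\beta_6 + \beta_9\alpha_7)\,\mathrm{T}_{j} + (\beta_7 + \beta_9\alpha_8)\,\mathrm{SZ}_{j} + (\beta_8 + \beta_9\alpha_9)\,\mathrm{FEM}_{ij} + \beta_9\alpha_1\,\mathrm{PRE}_{ij} \\
& + (\beta_{10} + \beta_9\alpha_{10})\,\mathrm{MULT}_{j} + (\beta_{11} + \beta_9\alpha_{11})\,\mathrm{INST}_{j} + u_{\mathrm{GRADE}} + \beta_9\,u_{\mathrm{KNOW}}
\end{aligned}
$$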
$\beta_9$ can be estimated by dividing the coefficient of PRE by $\alpha_1$ from Equation 1.
With this estimate of $\beta_9$ and the estimates of the $\alpha$'s, the remaining $\beta$'s can
be disentangled.
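The disentangling step can be sketched as follows. This is a minimal illustration under the assumption that each reduced-form coefficient in Equation 2A equals the corresponding β plus β₉ times the matching α, with the PRE coefficient equal to β₉α₁. The function name `disentangle` and every number below are hypothetical and are not estimates from the paper.

```python
def disentangle(reduced, alpha, pre_key="PRE"):
    """Recover the betas of Equation 2 from reduced-form (Equation 2A) coefficients.

    reduced[v] = beta_v + beta_9 * alpha[v] for every variable v except PRE,
    and reduced[PRE] = beta_9 * alpha[PRE].
    """
    beta_know = reduced[pre_key] / alpha[pre_key]  # estimate of beta_9
    betas = {
        v: c - beta_know * alpha[v]
        for v, c in reduced.items()
        if v != pre_key
    }
    return beta_know, betas


# Made-up first-stage (Equation 1) coefficients and "true" second-stage betas.
alpha = {"PRE": 0.5, "FEM": -0.2, "CALC": 0.3}
true_beta = {"FEM": 1.0, "CALC": -4.0}
true_beta_know = 2.0

# Build the reduced-form coefficients the way Equation 2A combines them.
reduced = {"PRE": true_beta_know * alpha["PRE"]}
for name, value in true_beta.items():
    reduced[name] = value + true_beta_know * alpha[name]

beta_know, betas = disentangle(reduced, alpha)
print(beta_know, betas)
```

The round trip recovers the made-up structural coefficients, which is all the sketch is meant to show; it says nothing about the sampling error of the ratio used to estimate β₉.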


In addition to providing us with information concerning instructor
leniency and contribution to student knowledge, Equations (1) and (2) can be
used to see whether variables such as sex of student, time of
class, student's year, etc. have effects on the student's knowledge of
economics different from--perhaps even opposite to--their impact on the
student's grade in the course.


⁵ What we are really interested in, of course, is the students'
perception of instructor leniency. Because perceived leniency may not be
closely related to final grades in the course, in estimating Equations (2)
and (2A) we used each student's grade in the course just prior to the time
the evaluations were conducted. The teaching evaluations were carried out
approximately two-and-a-half weeks prior to the end of the term's lectures.
2.4 "Evaluation" equation

We can use the estimated coefficients $\alpha_{11}$ and $\beta_{11}$ from Equations (1)
and (2) to explore the relative importance of the instructor's teaching
ability and the average leniency of an instructor in determining the student
evaluation of that instructor. The evaluation questionnaire included an
"overall effectiveness" question: "How would you rate your instructor in
terms of general, overall effectiveness as a teacher?" Students were asked
to give their ratings on an integer scale ranging from 5 ("Outstanding") to
1 ("Poor").⁶

It would be most desirable, for the purposes of our experiment, to
identify each student's evaluation of his instructor with the student's own
knowledge and grade. Unfortunately, this was not possible, because the
evaluations were done anonymously.⁷ As a result, we were forced to use
section averages for our regressions involving student evaluations of the
instructors.


These section averages are denoted by E. Our third equation is:

Equation 3

[Equation 3 is not legible in this copy.]

The independent variables in this equation are the estimated coefficients
on contribution to learning (from Equation (1)) and instructor leniency (from
Equation (2) or (2A)). It should be noted that in this study we are not
attempting to explain all the factors that go into the determination of student
evaluations of instructors. Our aim is more modest. The estimates of
Equation (3) will indicate whether or not the amount taught to students by
an instructor and the instructor's leniency in "handing out" grades have a
statistically significant influence on student ratings of instructors and,
if so, which effect is stronger.


⁶ The gradations are:
    5 - Outstanding
    4 - Very good
    3 - Good
    2 - Satisfactory
    1 - Poor

⁷ In the past, students at U.W.O., fearing reprisals from their
instructors, refused to identify themselves with their student numbers on
evaluation forms. This resulted in a high incidence of invalid responses,
and the solicitation of student numbers was abandoned in 1974.
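The displayed form of Equation 3 does not survive in this copy. Going only by the description in the text (section-average evaluations regressed on the estimated instructor coefficients from the first two equations, plus an intercept), one plausible reading is sketched below. The symbols $\gamma_0$, $\gamma_1$, $\gamma_2$ and the error term $u_{E}$ are assumptions introduced here for illustration and may not match the authors' notation.

```latex
% Hedged reconstruction: E_j is the section-average evaluation of instructor j,
% \hat{\alpha}_{11,j} the estimated contribution to knowledge (Equation 1),
% \hat{\beta}_{11,j} the estimated leniency (Equation 2 or 2A).
% The gammas and u_E are hypothetical symbols, not taken from the source.
E_j = \gamma_0 + \gamma_1\,\hat{\alpha}_{11,j} + \gamma_2\,\hat{\beta}_{11,j} + u_{E,j}
```

Under this reading, the question of "which effect is stronger" amounts to comparing the estimates of $\gamma_1$ and $\gamma_2$ and their significance.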


3. The Results

The model described in the previous section was estimated using
ordinary least squares. In this section, we discuss these results, focusing
first on the estimates of Equations 1, 2, and 2A and then on Equation 3.

3.1 Knowledge and Reward


Our estimates of Equations 1, 2, and 2A are presented as Regressions 1,
2, and 2A, respectively. In these regressions, all the independent variables
are entered as dummy variables, whose definitions are given in Table 1. We
believe that the results provide some interesting information about the factors
influencing a student's knowledge of economics at the end of a semester of micro
principles and the grade a student receives. Since the regressions have most of
their explanatory variables in common, it seems natural to discuss the results
in terms of the impact of each of these variables.

Previous economics. It appears that having had an economics course prior
to the college principles course has at best no effect on a student's knowledge
or his grade in the principles course. Having had previous economics may even
have an adverse effect on both KNOW and GRADE. In Regression 1, the coefficients
on PE2, PE3, and PE4 (student had some previous economics) are all negative
but insignificant, while in Regression 2 the coefficient of PE1 (student had no
previous economics) is positive and significant at the 10% level. Since nearly
all of those students who say they have had "economics" prior to the principles
course had such a course in secondary school, these results may shed some light


Table 1

Definitions of Variables

Variable        Value of variable = 1, if ...

PE1             no previous economics course
PE2             one previous economics course, passed
PE3             one previous economics course, failed
PE4             more than one previous economics course

CALC1           no previous calculus course
CALC2           one term of previous calculus
CALC3           two terms of previous calculus
CALC4           more than two terms of previous calculus

AAFA (AAUA)     previous academic average of A, freshman (upperclassman)
AAFB (AAUB)     previous academic average of B, freshman (upperclassman)
AAFC (AAUC)     previous academic average of C, freshman (upperclassman)
AAFD (AAUD)     previous academic average of D, freshman (upperclassman)

Y1              first-year student
Y2              second-year student
Y3              third-year student and other

FEM             1 = female student, 0 = male

MULT1           classroom tests and assignments < 25% multiple choice
MULT2           classroom tests and assignments 26-50% multiple choice

PRE...          ... or fewer correct answers on pretest
PRE...          ... correct answers on pretest
PRE...          ... correct answers on pretest
PREK            K correct answers on pretest (3 < K < 12)
PRE12           12 or more correct answers on pretest

KDUM6           6 or fewer correct answers on post-test
KDUM7           7 correct answers on post-test
KDUMN           N correct answers on post-test (7 < N < 16)
KDUM16          16 or more correct answers on post-test

GRADE           student's course grade just prior to the evaluation
KNOW            score on post-test, 0-19
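The way a raw post-test score maps onto the KDUM dummies of Table 1 can be sketched as below. The function name `kdum_label` is hypothetical, and the label "KDUM6" for the omitted lowest category is an assumed name (only KDUM16 is named explicitly in the text); the cut-offs follow the table's "6 or fewer", "7", "N (7 < N < 16)", and "16 or more" rows.

```python
def kdum_label(post_test_score):
    """Return the Table 1 KDUM category for a post-test score on the 0-19 scale."""
    if not 0 <= post_test_score <= 19:
        raise ValueError("post-test scores range from 0 to 19")
    if post_test_score <= 6:
        return "KDUM6"   # assumed name for the lowest ("6 or fewer") category
    if post_test_score >= 16:
        return "KDUM16"  # top category: 16 or more correct answers
    return f"KDUM{post_test_score}"  # one dummy per score from 7 through 15


def kdum_dummies(post_test_score):
    """One-hot encode the score over the ten non-reference categories 7..16."""
    label = kdum_label(post_test_score)
    return {f"KDUM{n}": int(label == f"KDUM{n}") for n in range(7, 17)}


print(kdum_label(10), kdum_dummies(10))
```

A student in the lowest category gets zeros on all ten dummies, so the estimated KDUM coefficients are read as grade differences relative to that group.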


Regression 1 (standard errors in parentheses)

KNOW = 7.04  - 0.228 PE2 - 0.429 PE3 - 0.437 PE4 - 1.23 CALC2 + 0.207 CALC3
      (.889)  (.265)      (2.78)      (.518)      (.535)       (.244)

     + 0.377 CALC4 + 2.27 AAFA + 0.804 AAFB + 1.92 AAFD + 1.06 AAUA
      (.377)        (.351)      (.266)       (1.00)      (.740)

     + 1.65 AAUB - .018 AAUC + .620 AAUD + .070 Y3 - 0.176 FEM
      (.475)      (.478)      (.888)      (.581)    (.253)

     - 0.166 MULT1 - 0.064 MULT2 + SUM(k=3..12) a_k PREK + SUM(k=2..14) b_k INST_k
      (.372)        (.355)

R² = 0.311

a3  = 0.665 (.788)                  a8  = 2.06 (.699)
a4  = 0.661 (.725)                  a9  = 3.06 (.734)
a5  = 1.33  (.797)                  a10 = 3.36 (.776)
a6  = 1.37  (.690)                  a11 = 4.28 (.921)
a7  = 2.02  (.716)                  a12 = 5.42 (.875)

b1  = 0 (omitted instructor)        b8  = 1.30  (.566)
b2  = 1.52  (.524)                  b9  = 0.632 (.582)
b3  = 0.609 (.587)                  b10 = 3.00  (.517)
b4  = 1.02  (.540)                  b11 = 0.258 (.542)
b5  = 2.17  (.607)                  b12 = 1.34  (.542)
b6  = 1.42  (.615)                  b13 = 1.84  (.663)
b7  = 2.23  (.699)                  b14 = 1.83  (.596)
Regression 2 (standard errors in parentheses)

GRADE = 52.00 + 1.78 PE1 - 4.20 CALC1 + 10.76 AAFA + 3.22 AAFB - 4.56 AAFD
       (2.89)  (1.08)     (.968)       (1.53)       (1.13)      (3.96)

      + 14.06 AAUA + 6.56 AAUB - 3.69 AAUC + 1.42 AAUD + 2.31 Y3
       (3.10)       (2.00)      (2.00)      (3.74)      (2.45)

      + .26 FEM - 1.68 MULT1 - 1.38 MULT2 + SUM(n=7..16) c_n KDUM_n + SUM(k=2..14) d_k INST_k
       (1.06)    (1.56)       (1.50)

R² = 0.404

c7  = 0.208 (2.52)                  c12 = 10.29 (2.22)
c8  = 4.93  (2.30)                  c13 = 11.34 (2.22)
c9  = 4.69  (2.23)                  c14 = 12.22 (2.60)
c10 = 5.36  (2.16)                  c15 = 16.80 (2.50)
c11 = 9.46  (2.29)                  c16 = 20.05 (2.68)

d1  = 0 (omitted instructor)        d8  = 3.41  (2.37)
d2  = 3.14  (2.22)                  d9  = 9.36  (2.44)
d3  = 3.61  (2.48)                  d10 = 0.039 (2.24)
d4  = 5.57  (2.28)                  d11 = 10.97 (2.27)
d5  = 3.64  (2.59)                  d12 = 5.01  (2.30)
d6  = 2.81  (2.60)                  d13 = 5.47  (2.82)
d7  = 0.944 (2.24)                  d14 = 0.492 (2.54)


Regression 2A (disentangled coefficients)

GRADE = 35.18 + 1.84 PE1 - 4.06 CALC1
      + 8.90 AAFA + 2.48 AAFB - 5.04 AAFD
      + 12.11 AAUA + 4.91 AAUB - 3.86 AAUC
      + 0.34 AAUD + 2.23 Y3 + 1.58 FEM
      - 1.53 MULT1 + 1.50 MULT2 + 2.43 (estimated knowledge)
      + SUM(k=2..14) d_k INST_k

R² = .303 for the estimated equation
on the teaching and learning of secondary-school economics. A student may
take a high school course that is billed as an economics course but which,
in fact, bears only a vague resemblance to the course he encounters in college.
The resemblance is not strong enough to help the student perform better in the
college course and may even result in confusing him. A related possibility is
that the student is taught a principles course in high school and is taught
badly. Alternatively, a student may arrive in the college course with some
knowledge but a false sense of having already mastered the material. In
either case, his performance in the college course would be adversely affected.


Academic average. Students with academic averages of A or B (prior
to enrolling in the principles course) do significantly better both on our
post-test and in the principles course than those with lower averages.⁸
Upperclass A and B students appear to get higher grades than freshmen in
their section with similar knowledge and academic background. This is
probably due to the higher standards in university (an A average in college
generally represents somewhat better performance than it does in secondary
school) and to the greater experience upperclass students have in taking
college-level exams. Somewhat surprising is the insignificant coefficient
of AAUA in Regression 1--upperclass students with an A average do not know
significantly more economics at term's end than do freshmen with a high-school
C. Yet the more senior A student can expect a considerably higher grade in
the course than a first-year student in his section with a C average and the
same knowledge! Ability in writing college-level exams appears to be
handsomely rewarded.


⁸ Something of a puzzle is the positive and significant (at the 5%
level) coefficient of AAFD in Regression 1. We have no entirely convincing
explanation why freshmen coming in with a D average should do 2 points
better, ceteris paribus, on the post-test than those in their cohort with
a C average. Perhaps, being underdogs, they try harder. In any case, those
in the AAFD category represent a very small fraction (1.3%) of our sample.
This result may therefore be due to extraordinary performance by two or
three students.

Sex of student. An interesting non-result is the fact that male
and female students of like background do not differ significantly in
their performance either on the post-test or in the course itself.
Controlling for pre-test performance and academic background, as we did in
Regression 1, gives a negative but quite insignificant coefficient on FEM.
Similar control in the GRADE equation produces a small, positive, and
again quite insignificant coefficient on FEM.⁹

Calculus background. Students who have had no calculus course
do slightly (but statistically significantly) better on our post-test than
do students who have had a term of calculus, ceteris paribus. Those
with even more calculus background do not know significantly more economics
at the end of the micro term of principles than do students without any
calculus. On the other hand, the lack of a calculus background does work
to a student's detriment when it comes to performance in the principles
course (cf. negative coefficient of CALC1 in Regression 2).

Our post-test attempts to measure primarily knowledge of and ability
to deal with basic economic concepts and does not reward analytical ability
per se. Lectures and course tests, on the other hand, may be more directly
concerned with the manipulation of tools of analysis and hence reward more
highly those who have greater exposure to calculus--even though calculus
was not explicitly required in handling the problems. Although those
without calculus background appear to have at least as good--possibly
even better--knowledge of economic concepts as their more numerate
classmates, they are at a disadvantage in the course exams and assignments.


⁹ It might be noted that all but one of the instructors
in our sample are male, while 28.9% of the students are female.


Time and size of class. These variables were dropped from the
regression by our regression package. (None of the coefficients associated
with any of the time and size variables was significantly different from
zero at the 99.999% level.) Neither student knowledge nor grade is
affected by the time of day that a class meets or whether it is
held in one- or two-hour meetings.

Pre-test and post-test. Students who enter the principles course
knowing some economics do significantly better on the post-test than those
who knew very little at the start. (This can be seen in Regression 1
from the coefficients of PRE5 through PRE12 as compared to those of PRE3
and PRE4. The coefficient of the omitted dummy variable PRE2 is, of course,
zero.) The gap between these two groups narrows by the term's end.
Other things equal, a student who scored 12 or more correct answers on the
pre-test can be expected to do only about 5 points better on the post-test
than a student who had correctly answered only 4 or fewer questions on
the pre-test.

Course grades appear to be fairly well related to student knowledge,
even when factors related to student background and instructor leniency are
controlled for. This can be seen from the coefficients of KDUM in
Regression 2. (KDUM is the dummy version of the KNOW variable; see
Table 1 for definitions.) Students who scored less than 8 correct
answers on the post-test do significantly worse in the course than
those whose knowledge is greater. At the extreme (cf. coefficient
of KDUM16), the difference in grade can be as great as 20 points.¹⁰
Instructors and knowledge. The coefficients of INST in Regression 1
give us our measure of instructors' contribution to student knowledge.
The numéraire (omitted) instructor is INST1; since all other b's
are positive, his is the least contribution. The contribution (or value
added) of instructors 3, 9, and 11 is not significantly greater than his.
At the other end of the scale is instructor 10, whose students can be
expected to score three points higher on the post-test than students of
instructor 1, even when possible differences in class composition, etc.,
are controlled for. (A difference of three points on a nineteen-question
test is quite substantial; recall that the difference between the overall
post-test mean score and the overall pre-test mean was about 4.3 points.)
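The statement about instructors 3, 9, and 11 can be checked against the t-ratios implied by Regression 1. The sketch below assumes a one-sided test at the 5% level with the large-sample critical value 1.645; that choice of test is an assumption made here, not something the paper states. The coefficient and standard-error pairs are as read from the Regression 1 printout.

```python
# (coefficient, standard error) for the INST dummies in Regression 1,
# keyed by instructor number; instructor 1 is the omitted reference.
INST_KNOW = {
    2: (1.52, 0.524), 3: (0.609, 0.587), 4: (1.02, 0.540), 5: (2.17, 0.607),
    6: (1.42, 0.615), 7: (2.23, 0.699), 8: (1.30, 0.566), 9: (0.632, 0.582),
    10: (3.00, 0.517), 11: (0.258, 0.542), 12: (1.34, 0.542),
    13: (1.84, 0.663), 14: (1.83, 0.596),
}

# Assumed decision rule: one-sided 5% test, normal approximation.
ONE_SIDED_5PCT = 1.645


def t_ratio(coef, se):
    """Ratio of an estimated coefficient to its standard error."""
    return coef / se


t_ratios = {k: t_ratio(c, se) for k, (c, se) in INST_KNOW.items()}
not_significant = sorted(k for k, t in t_ratios.items() if t < ONE_SIDED_5PCT)
largest = max(INST_KNOW, key=lambda k: INST_KNOW[k][0])

print(not_significant)  # instructors whose value added is not distinguishable from INST1
print(largest)          # instructor with the largest estimated contribution
```

Under these assumptions the instructors flagged as not significantly different from the reference are 3, 9, and 11, and the largest coefficient belongs to instructor 10, which agrees with the text.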
Instructors and leniency. Instructors appear to differ
substantially in their liberality in grading. From the coefficients of the INST
variables in Regression 2, we note that INST1, the reference instructor,
is the toughest grader. Several instructors are not significantly more
lenient than he is. But a student of a given background, with a given
level of knowledge of economics, can expect to receive a grade from five
to eleven points higher from other instructors.¹¹


¹⁰ The coefficients of KDUM fall into several groups. Holding other
factors constant, post-test scores of 8-10 result in a percent grade
about 5 points higher than post-test scores below 8. Post-test scores of
11-14 are "worth" about 10-12 extra percentage points, while post-test
scores of 15 or better yield an extra 17-20 points in grades.

¹¹ Some interesting sidelights: The instructor with the greatest
value added (INST10) is one of the least lenient, while the instructor
with the least value added (INST1) is also one of the least lenient. The
most lenient instructor (INST11) has a value added not significantly greater
than that of the reference instructor.
The disentangled coefficients of Equation 2A are presented as
Regression 2A. While there are some slight differences between the
coefficients of Regression 2 and Regression 2A, these appear to be
negligible. Even though several of the independent variables are
statistically significant in explaining knowledge, the anticipated
problem of multicollinearity seems small, perhaps because these
variables explain only about 29% of the variation in KNOW.


3.2 Value added, leniency, and evaluations


Having arrived at measures of each instructor's contribution to
students' knowledge and his leniency in grading, we are now in a
position to confront the central question of this study: To what extent
are instructor leniency and "value added" rewarded by high evaluations?
Our measure of contribution to knowledge (CONTRIB) is the set of
estimated coefficients {b_2, ..., b_14} from Regression 1; our measure of
leniency (LEN) is the set of estimated coefficients {d_2, ..., d_14} from
Regression 2. When E, the section mean response to the "overall
effectiveness" question, is regressed on these variables plus an
intercept term, the result is:


Regression 3A (standard errors in parentheses)

E = 3.37  - 0.124 CONTRIB - 0.086 LEN
   (0.465) (0.216)         (0.55)

R² = 0.186


Both coefficients are quite close to and not significantly different
from zero. Apparently, neither leniency in grading nor contribution to
students' knowledge has appreciable influence on what students consider
"effective teaching". In order to correct for what may have been a
subjective response


by students to instructors with a foreign (i.e., non-North American)
accent, we estimated Equation 3, including a dummy variable FOR (whose
value is one for instructors whose mother tongue was not English):


Regression 3B (standard errors in parentheses)

E = 3.37  - 0.084 CONTRIB - 0.020 LEN - 0.870 FOR
   (0.328) (0.153)         (0.043)     (0.250)

R² = 0.632


Although inclusion of FOR substantially improves the fit of the
evaluation equation, the impact of CONTRIB and LEN becomes even
smaller.¹¹
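To make the significance claims easy to check, the following minimal
sketch divides each coefficient of Regression 3B by its standard error
and compares the resulting t-ratio with a two-tailed 5% critical value.
The coefficients and standard errors are those reported above; the sample
size of 14 (one observation per section) is an assumption taken from the
number of sections, and the names `t_ratios`, `N_OBS`, and `T_CRIT` are
illustrative only.

```python
# Coefficients and standard errors as reported for Regression 3B.
coefs = {"const": 3.37, "CONTRIB": -0.084, "LEN": -0.020, "FOR": -0.870}
ses = {"const": 0.328, "CONTRIB": 0.153, "LEN": 0.043, "FOR": 0.250}

# Assumption: one observation per section (14) and four estimated
# parameters, leaving 10 degrees of freedom.
N_OBS = 14
DF = N_OBS - len(coefs)
T_CRIT = 2.228  # two-tailed 5% critical value of Student's t, 10 df


def t_ratios(coefs, ses):
    """Return coefficient / standard error for every regressor."""
    return {name: coefs[name] / ses[name] for name in coefs}


for name, t in t_ratios(coefs, ses).items():
    verdict = "significant" if abs(t) > T_CRIT else "not significant"
    print(f"{name:8s} t = {t:6.2f}  ({verdict} at the 5% level)")
```

Under these assumptions only the intercept and FOR clear the critical
value; the t-ratios for CONTRIB and LEN are both well below one in
absolute value.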


These results, along with various other tests of the robustness of
Regressions 3A and 3B,¹² suggest that in evaluating an instructor's
"overall effectiveness" students are not primarily (or even strongly)
responsive either to the instructor's ability in developing students'
knowledge of economics or to the severity of the instructor's grading of
their performance.


¹¹ The results using the coefficients from Regression 2A rather than
Regression 2 are essentially no different.

Regression 4A

E = 3.36  - 0.187 CONTRIB - 0.084 LEN
   (0.456) (0.242)         (0.055)

R² = 0.175

Regression 4B

E = 3.38  - 0.103 CONTRIB - 0.025 LEN - 0.860 FOR
   (0.318) (0.170)         (0.042)     (0.242)

R² = 0.635


¹² Other independent variables which might influence student ratings of
instructor effectiveness are the teaching experience and the sex of the
instructor. We reran Regressions 3A and 3B with variables accounting for
each instructor's total previous teaching experience, previous principles
experience, or the square roots of each of these, with no changes in the
results reported above. Size of class and the time the class met were
also insignificant. We could not include a dummy variable for sex of the
instructor because we had only one female instructor in our sample. The
results were also unchanged when we dropped INST(1) or INST(11) (both
outliers in some sense) or all instructors with a foreign accent from our
sample. None of the instructors in our sample was French-Canadian, and
none had British, Irish, or Australian accents. There was also no change
in the results when we used final course grades


4. Concluding Remarks 


If, as our results indicate, evaluations do not depend on leniency,
why have some other studies found a positive relationship between grades
and evaluations? Presumably this observed relationship in these studies
is not proxying for a positive relationship between learning and
evaluations, since this relationship also was not borne out in our study.
Two possibilities immediately come to mind: (1) the students in different
studies are not random samples from the entire population of students;
(2) in the studies which used individual data instead of section
averages, the observed results may be picking up the possibility that
those instructors taught primarily to the brighter students (who
consequently received higher grades). Such behaviour would have been
masked by our use of section averages.

We would like to stress that we have not attempted in this study to
capture all of the factors that determine evaluations; we have not
attempted, in other words, to estimate the equation that best predicts E.
What is being measured by student evaluations of teaching effectiveness
remains an open question and a disturbing one. Our findings lead us to
believe that students evaluate instructors on the basis of fairly
subjective feelings which are not related in any direct way either to the
grades they receive or to how much they learn from the instructor. High
ratings for "effective teaching" may thus go to instructors who have good
rapport with students, who show "concern" for students, or who create a
pleasant classroom atmosphere. First year students (who comprise 81.2% of
our sample) may be particularly sensitive to instructor characteristics
which help make their transition from high school to university less
painful. Such characteristics may bear little relationship to leniency in
grading or ability to convey knowledge of the subject.


This kind of student response is consistent with the notion that
university attendance is to a large extent a consumption activity.
Students rate highly those instructors who provide a high quality of the
consumption good. In rewarding instructors with high evaluations,
university administrators may not be rewarding the best teachers (if
teaching is taken to mean contribution to student knowledge) but are
providing incentives for instructors to develop whatever characteristics
go into producing the consumption good. It is hard to see how such an
incentive system could help build or maintain great universities. In
times of sagging enrolments (and the attendant financial crunches),
however, the appeal of such a reward structure may be irresistible.

While we place a great deal of confidence in our results, we should
emphasize that they have been obtained from one beginning course in one
department in one university. The results might be different for a
different department, for students in an upper-level course, or for
different types of students at different universities. We strongly
suspect that replications of this experiment will yield similar results,
but we encourage those interested in pursuing the matter further to adopt
the approach we have used and to measure learning and leniency as
accurately as possible.


To the extent that upperclass students have made the adjustment to
university, we would expect them to respond somewhat differently. If
evaluations were available on an individual student basis, tests of this
hypothesis would be most interesting.

